
Conversation

@jrmaddison
Contributor

Only set the TAO gradient norm in cases where the rest of the optimizer is also configured to use the Riesz map.

NLS seems to use the gradient norm even where the optimizer is not fully configured to use the Riesz map, e.g. for the identity perturbation described at https://petsc.org/release/manual/tao/#newton-line-search-method-nls.

Includes #239, which should be reviewed/merged first.

@jrmaddison jrmaddison changed the title from "Avoid partial use of a Riesz map" to "Avoid partial use of a Riesz map in TAO optimizers" on Nov 13, 2025
@JHopeCollins
Contributor

From petsc/src/tao/unconstrained/impls/nls/nls.c it looks like TaoGradientNorm is only used for checking convergence. Should it be used elsewhere too?

Maybe we should add a use_gradient_norm=None kwarg to TAOSolver: if None, the gradient norm is only set for known Riesz-map-aware methods (e.g. LMVM); if a bool, it forces the norm to be used or not used (with some kind of "developer-only" note in the docstring).
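Roughly what I have in mind, as a sketch only: it assumes a petsc4py TAO object, a PETSc Mat riesz_mat applying the Riesz map, and that the solver sets the norm via tao.setGradientNorm; the helper name and the set of "aware" types below are illustrative, not the real TAOSolver internals.

```python
from petsc4py import PETSc

# Methods we currently trust to use the Riesz map consistently (illustrative list).
RIESZ_MAP_AWARE_TYPES = {PETSc.TAO.Type.LMVM, PETSc.TAO.Type.BLMVM}


def maybe_set_gradient_norm(tao, riesz_mat, use_gradient_norm=None):
    """Set the TAO gradient norm only when it is safe to, or when forced.

    use_gradient_norm=None  -> set it only for known Riesz-map-aware methods
    use_gradient_norm=True  -> always set it (the "developer-only" override)
    use_gradient_norm=False -> never set it
    """
    if use_gradient_norm is None:
        use_gradient_norm = tao.getType() in RIESZ_MAP_AWARE_TYPES
    if use_gradient_norm:
        # Convergence tests then use the Riesz-mapped norm, but other parts of
        # the method (e.g. the NLS perturbation) may still assume the
        # Euclidean inner product.
        tao.setGradientNorm(riesz_mat)
```

That keeps the default safe while leaving the forced-True path available as an escape hatch.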

@JHopeCollins JHopeCollins added the bug label on Nov 17, 2025
@jrmaddison jrmaddison force-pushed the jrmaddison/tao_gradientnorm branch from 624244f to 28d0208 on November 17, 2025 11:22
@jrmaddison
Contributor Author

It's used elsewhere too: the gnorm variable is reused later in nls.c.

@JHopeCollins
Contributor

JHopeCollins commented Nov 17, 2025

Good point, I was misled by the comments. Leaving aside the trust region KSPs, gnorm is used for setting that identity shift. In theory, should that be a Riesz-map shift (even though I don't think we could do that in practice currently)? Could/should we instead sniff whether the user has set tao.type == TAO.Type.NLS and default the min/max/initial shift all to 0.0? Then the convergence test would still use the gradient norm.

I have zero knowledge of bound-constrained Krylov methods, so I can't comment on what should happen there.

In any case, I think the amount of discussion there has been about Riesz maps + various optimisation methods is telling us that there should be at least a bit of exposition in the TAOSolver docstring with the current best guess at acceptable/unacceptable/unknown combinations. I will try to draft something later this week (although I'd welcome suggestions!)

@jrmaddison
Contributor Author

> Good point, I was misled by the comments. Leaving aside the trust region KSPs, gnorm is used for setting that identity shift. In theory, should that be a Riesz-map shift (even though I don't think we could do that in practice currently)? Could/should we instead sniff whether the user has set tao.type == TAO.Type.NLS and default the min/max/initial shift all to 0.0? Then the convergence test would still use the gradient norm.

Yes, it should be a Riesz map shift. My preference would be to stick to the PETSc defaults where we can't configure the Riesz map, but agreed, documentation for this would be good.

@JHopeCollins
Contributor

JHopeCollins commented Nov 17, 2025

> Yes, it should be a Riesz map shift.

Right, so currently shifting is not valid, although it possibly wouldn't be too difficult to add by modifying what ReducedFunctionalMat.mult does with the shift member.
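For concreteness, a hypothetical sketch of what I mean: the attribute names below (_hessian_action, _riesz_mat, _shift) are made up and not the real ReducedFunctionalMat internals. The idea is that the perturbation becomes y <- H x + sigma M x, with M the Riesz map, rather than the identity shift y <- H x + sigma x.

```python
class ShiftedHessianCtx:
    """Hypothetical python-context Mat illustrating a Riesz-map-aware shift."""

    def __init__(self, hessian_action, riesz_mat):
        self._hessian_action = hessian_action  # callable computing y <- H x
        self._riesz_mat = riesz_mat            # PETSc Mat applying the Riesz map M
        self._shift = 0.0                      # sigma, updated when TAO perturbs the Hessian

    def mult(self, mat, x, y):
        self._hessian_action(x, y)  # y <- H x
        if self._shift != 0.0:
            # Riesz-map shift: y <- y + sigma * M x, instead of the identity
            # shift y <- y + sigma * x that the NLS perturbation assumes.
            tmp = x.duplicate()
            self._riesz_mat.mult(x, tmp)
            y.axpy(self._shift, tmp)
```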

> My preference would be to stick to the PETSc defaults where we can't configure the Riesz map.

Fair enough. I think there's a balance between trying to do the right thing ourselves by default and the principle of least surprise for someone looking at the TAO docs.

> documentation for this would be good.

For now, please can you add a note to the docstring just saying that the GradientNorm is only set for LMVM and BLMVM, and that Riesz map support is known to be incomplete elsewhere, so alternative methods are used at the user's peril.
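Something along these lines, purely as a sketch of the wording (the section formatting is a guess; match whatever style the TAOSolver docstring already uses):

```python
class TAOSolver:
    """...

    Notes
    -----
    The TAO gradient norm (TaoSetGradientNorm) is currently only set for the
    LMVM and BLMVM methods. Riesz map support for other TAO methods is known
    to be incomplete (for example, the NLS perturbation is applied as an
    identity shift), so other methods are used at the user's peril.
    """
```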

I'm happy to approve with that caveat.

Actually, please can you add the kwarg below so we (I) have an escape hatch to use the Riesz map to check convergence but set the right options to prevent shifting.

> Maybe we should add a use_gradient_norm=None kwarg to TAOSolver: if None, the gradient norm is only set for known Riesz-map-aware methods (e.g. LMVM); if a bool, it forces the norm to be used or not used (with some kind of "developer-only" note in the docstring).

@JHopeCollins
Contributor

JHopeCollins commented Nov 17, 2025

Sorry to keep nitpicking, but please can we have a way to force the gradient norm to be set? Otherwise it'll be impossible to use the Riesz-mapped gradient to test convergence with Newton.
